Day 08 Azure Cognitive Services: Object Detection - Boxing the Cats in a Photo

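2021 iThome Ironman Contest

DAY 8

AI & Data

"I Don't Really Understand AI, but I Know a Bit of Python and Azure" series, part 8

13th Ironman Contest, Microsoft Azure, data science with Azure, Azure object detection

Ben

Team: Five Chickens Who Found Their Deadlift Had Dropped 100 kg Once They Could Go Back to the Gym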

2021-09-08 07:32:49

Object Detection

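Object detection mainly does two things:

- Locate the objects, as with the three boxes in the figure below.
- Determine what each object is (image classification), as with the labels identified for the three boxes in the figure below.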
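
To make the two tasks concrete, here is a minimal sketch of what a single detection carries: a box (where the object is) plus a label and a confidence score (what it is). The `Detection` class and its field names are hypothetical and chosen only for illustration; they are not part of any Azure SDK, and the values are made up.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object: where it is and what it is (hypothetical type)."""

    x: int  # left edge of the box, in pixels
    y: int  # top edge of the box, in pixels
    w: int  # box width, in pixels
    h: int  # box height, in pixels
    label: str  # classification result
    confidence: float  # score between 0 and 1

    def describe(self):
        """Return a one-line summary of the location and the label."""
        return "{} ({:.0%}) at x={}, y={}, w={}, h={}".format(
            self.label, self.confidence, self.x, self.y, self.w, self.h
        )


# Made-up values, not real output from any model.
cat = Detection(x=10, y=20, w=100, h=50, label="cat", confidence=0.87)
print(cat.describe())
```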

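Object detection is a relatively mature technology. The main approach today is still to train a deep learning model. During training, the model has to analyze a large number of images together with the positions and labels of the objects in them before it can recognize objects in new images. A detailed introduction to object detection is fairly involved, so this post does not walk through it. One of the best-performing object detection models at the time of writing is YOLO (You Only Look Once), and it keeps evolving. The latest version endorsed by the original author is YOLOv4, and its paper explains the detection steps in detail for anyone who wants to dig deeper.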

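Although object detection models are somewhat complex, Azure offers its own pretrained model for everyone to use. You do not need to read the papers or understand the theory; you only need to know how to call the API. The rest of this post shows how to take that shortcut with the Azure Computer Vision service.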

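Setting Up the Azure Computer Vision Service

- Go to https://portal.azure.com/#home
- Click "Create a resource".
- Search for and select "Computer Vision".
- Give the resource a name of your choice.
- Find a region where the Free F0 pricing tier is available, and select Free F0.
- Add tags.
- Click "Review + create".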

Installing the `Python` Packages

The following packages are required:

azure-cognitiveservices-vision-computervision
Pillow
requests
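
Before running the sample, it can help to confirm that all three packages are importable. The sketch below is one way to do that with the standard library. The `missing_packages` helper and the `REQUIRED` mapping are hypothetical names introduced here; the module names come from the imports used in the sample code (note that Pillow is imported as `PIL`).

```python
import importlib.util

# Maps each pip package name to the module name used in `import`.
REQUIRED = {
    "azure-cognitiveservices-vision-computervision":
        "azure.cognitiveservices.vision.computervision",
    "Pillow": "PIL",
    "requests": "requests",
}


def missing_packages(required):
    """Return the pip names whose import module cannot be found."""
    missing = []
    for pip_name, module_name in required.items():
        try:
            found = importlib.util.find_spec(module_name) is not None
        except ModuleNotFoundError:
            # Raised when a parent package (for example `azure`) is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    not_installed = missing_packages(REQUIRED)
    if not_installed:
        print("Install with: pip install " + " ".join(not_installed))
    else:
        print("All required packages are installed.")
```

If anything is reported missing, installing it with `pip install` followed by the listed names should be enough in a typical environment.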

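Key and Endpoint

Get the key (SUBSCRIPTION KEY) and the endpoint (ENDPOINT):
- Go to https://portal.azure.com/#home
- Click "All resources".
- Click the Computer Vision resource you just created.
- Click "Keys and Endpoint".
- Copy the key and the endpoint.
All Azure Computer Vision features share the same key and endpoint. The sample code reads both values from the environment variables `SUBSCRIPTION_KEY` and `ENDPOINT`.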
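
Because `os.getenv` quietly returns `None` when a variable is not set, a missing key or endpoint tends to surface later as a confusing client error. The sketch below fails early instead. The `load_credentials` helper is a hypothetical name introduced here; the variable names `SUBSCRIPTION_KEY` and `ENDPOINT` match the ones used in the sample code.

```python
import os


def load_credentials(env=os.environ):
    """Return (key, endpoint), or raise if either variable is unset or empty."""
    key = env.get("SUBSCRIPTION_KEY")
    endpoint = env.get("ENDPOINT")
    missing = [
        name
        for name, value in (("SUBSCRIPTION_KEY", key), ("ENDPOINT", endpoint))
        if not value
    ]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return key, endpoint
```

Passing the environment as a parameter keeps the helper easy to check with a plain dictionary, without touching the real key.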

Sample Code

import os
from io import BytesIO
import requests
from PIL import Image, ImageDraw, ImageFont
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

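# Import the required packages, mainly for reading files, drawing, and Azure.

# Besides importing packages, the script first needs the key (SUBSCRIPTION_KEY)
# and the endpoint (ENDPOINT) to gain access to the Computer Vision service.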

SUBSCRIPTION_KEY = os.getenv("SUBSCRIPTION_KEY")
ENDPOINT = os.getenv("ENDPOINT")
CV_CLIENT = ComputerVisionClient(
    ENDPOINT, CognitiveServicesCredentials(SUBSCRIPTION_KEY)
)


def main():
    """
    Azure object detection
    """
    
    # Fetch the image from its URL
    url = "https://i.imgur.com/Js5H6Qa.jpg"
    response = requests.get(url)
    response.raise_for_status()  # stop early if the download failed
    img = Image.open(BytesIO(response.content))
    
    # Set up drawing. Writing text on the image requires a font file.
    draw = ImageDraw.Draw(img)
    font_size = int(5e-2 * img.size[1])
    fnt = ImageFont.truetype("../static/TaipeiSansTCBeta-Regular.ttf", size=font_size)
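    # Detect objects with the Computer Vision service. Each result contains the
    # top-left corner (x, y) of the bounding box and its width and height (w, h).
    # These four values are enough to draw the box and to label it with the
    # recognized name and the confidence score.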
    object_detection = CV_CLIENT.detect_objects(url)
    if len(object_detection.objects) > 0:
        for obj in object_detection.objects:
            left = obj.rectangle.x
            top = obj.rectangle.y
            right = obj.rectangle.x + obj.rectangle.w
            bot = obj.rectangle.y + obj.rectangle.h
            name = obj.object_property
            confidence = obj.confidence
            print("{} at location {}, {}, {}, {}".format(name, left, right, top, bot))
            draw.rectangle([left, top, right, bot], outline=(255, 0, 0), width=3)
            draw.text(
                [left, top + font_size],
                "{0} {1:0.1f}".format(name, confidence * 100),
                fill=(255, 0, 0),
                font=fnt,
            )
    # Finally, save the result
    img.save("output.png")
    print("Done!")
    print("Please check output.png")


if __name__ == "__main__":
    main()
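
The box arithmetic inside the loop can be pulled out into two small helpers, which makes it easy to check without calling Azure. This is only a sketch: `rect_to_corners` and `label_text` are hypothetical names introduced here, not part of the Azure SDK or Pillow, and the numbers are made up rather than real service output.

```python
def rect_to_corners(x, y, w, h):
    """Convert a top-left corner plus width and height into
    (left, top, right, bottom), the form used when drawing the box."""
    return (x, y, x + w, y + h)


def label_text(name, confidence):
    """Format the label as the sample does: name plus confidence in percent."""
    return "{0} {1:0.1f}".format(name, confidence * 100)


# Made-up values for illustration only.
print(rect_to_corners(10, 20, 100, 50))
print(label_text("cat", 0.873))
```

In the sample, the same values would come from `obj.rectangle` and `obj.confidence` for each detected object.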